Fix Qwen3 MoE identity LoRA export layout by FurtherAI · Pull Request #688 · OpenPipe/ART

FurtherAI · 2026-05-21T09:32:01Z

Summary

Fixes Qwen3 MoE step-0 identity LoRA normalization so the identity adapter is exported in the same per-expert Qwen3 MoE layout as trained checkpoints.

Qwen3 MoE identity adapters are initially created through PEFT target-parameter LoRA, which produces fused expert keys like:

mlp.experts.base_layer.lora_A/B
mlp.experts.lora_A/B

ART now expands those Qwen3 MoE identity tensors into the vLLM/Megatron-compatible per-expert layout:

mlp.experts.{expert}.gate_proj.lora_A/B
mlp.experts.{expert}.up_proj.lora_A/B
mlp.experts.{expert}.down_proj.lora_A/B

This only adds a Qwen3 MoE to_vllm_lora_tensors conversion path. Trained Qwen3 MoE adapters that are already per-expert pass through unchanged.
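
The key renaming described above can be sketched as follows. This is a minimal, names-only illustration rather than ART's actual `to_vllm_lora_tensors` code: the regex, `plan_key_layout`, and the `model.layers.0` prefix are hypothetical, and how the fused tensors are sliced into per-expert weights is deliberately left out because the description does not spell it out.

```python
import re

# Assumption: fused identity keys end in "mlp.experts[.base_layer].lora_A/B",
# as listed in the summary. Real checkpoint prefixes may differ.
FUSED_RE = re.compile(r"^(?P<prefix>.*\bmlp\.experts)(?:\.base_layer)?\.lora_[AB]$")
PROJECTIONS = ("gate_proj", "up_proj", "down_proj")


def per_expert_keys(prefix: str, num_experts: int) -> list[str]:
    """List the per-expert key names for one MoE block."""
    return [
        f"{prefix}.{expert}.{proj}.lora_{ab}"
        for expert in range(num_experts)
        for proj in PROJECTIONS
        for ab in ("A", "B")
    ]


def plan_key_layout(keys: list[str], num_experts: int) -> list[str]:
    """Return the key names expected after normalization (names only)."""
    out: list[str] = []
    expanded: set[str] = set()
    for key in keys:
        match = FUSED_RE.match(key)
        if match is None:
            # Already per-expert, or not an expert key: pass through unchanged.
            out.append(key)
            continue
        prefix = match.group("prefix")
        if prefix not in expanded:
            # Expand each fused MoE block exactly once.
            expanded.add(prefix)
            out.extend(per_expert_keys(prefix, num_experts))
    return out
```

Running `plan_key_layout` on a list that is already per-expert returns it unchanged, which mirrors the pass-through behavior for trained adapters.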

Also adds experts to the Qwen3 MoE default target modules so that vLLM wraps the routed FusedMoE layer, while keeping gate_proj, up_proj, and down_proj for Megatron's per-expert LoRA wrapping.

Validation

- uv run --extra megatron --group dev pytest -q tests/integration/megatron/lora/test_lora_disk_codecs.py -k "qwen3_fused_identity or qwen3_dense_and_moe"
  - 2 passed, 5 deselected
- yes_no_trainability workflow for Qwen/Qwen3-30B-A3B-Instruct-2507
  - passed
  - initial eval reward: 0.5
  - final eval reward: 0.96875
  - saturated step: 2
  - train grad norms: 66.91, 61.89
- Confirmed vLLM loaded step @0, @1, and @2 adapters.
- Confirmed checkpoints 0000, 0001, and 0002 use per-expert Qwen3 MoE keys with no fused base_layer expert keys.
- Confirmed vllm==0.19.0 parser accepts the same per-expert Qwen3 MoE format.
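
A check like the "no fused base_layer expert keys" confirmation above could be scripted roughly as below. This is a sketch under stated assumptions, not the validation actually run for this PR: `fused_expert_keys` is a hypothetical helper, and it assumes per-expert keys always carry a numeric expert index directly after `experts`.

```python
def fused_expert_keys(keys):
    """Return the keys that still use the fused expert layout."""
    fused = []
    for key in keys:
        parts = key.split(".")
        if "experts" not in parts:
            continue  # not an expert key, nothing to check
        idx = parts.index("experts")
        following = parts[idx + 1] if idx + 1 < len(parts) else ""
        # Assumption: per-expert keys look like "...experts.<int>.<proj>.lora_A".
        if not following.isdigit():
            fused.append(key)
    return fused
```

An empty result means every expert key is per-expert. With the safetensors library, the key names could be read from a checkpoint via `safe_open(path, framework="pt")` and its `keys()` method; the checkpoint file path is not stated here.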

FurtherAI added 2 commits May 21, 2026 06:42

Normalize Qwen3 MoE identity LoRA layout

d439b7b

Target Qwen3 MoE experts in vLLM

a422d4b

FurtherAI marked this pull request as ready for review May 21, 2026 17:39

Kovbo self-requested a review May 21, 2026 18:21

Kovbo approved these changes May 21, 2026

FurtherAI merged commit 80a66de into main May 21, 2026
5 checks passed

FurtherAI deleted the austin/qwen3_moe_lora_codec branch May 21, 2026 18:27

FurtherAI merged 2 commits into main from austin/qwen3_moe_lora_codec